Abstract

Background: Parent-of-origin-dependent expression of alleles, imprinting, has been suggested to impact a 
substantial proportion of mammalian genes. Its discovery requires allele-specific detection of expressed transcripts, 
but in some cases detected allelic expression bias has been interpreted as imprinting without demonstrating 
compatible transmission patterns and excluding heritable variation. Therefore, we utilized a genome-wide tool 
exploiting high density genotyping arrays in parallel measurements of genotypes in RNA and DNA to determine 
allelic expression across the transcriptome in lymphoblastoid cell lines (LCLs) and skin fibroblasts derived from 
families. 

Results: We were able to validate 43% of imprinted genes with previous demonstration of compatible 
transmission patterns in LCLs and fibroblasts. In contrast, we only validated 8% of genes suggested to be imprinted 
in the literature, but without clear evidence of parent-of-origin-determined expression. We also detected five novel 
imprinted genes and delineated regions of imprinted expression surrounding annotated imprinted genes. More 
subtle parent-of-origin-dependent expression, or partial imprinting, could be verified in four genes. Despite higher 
prevalence of monoallelic expression, immortalized LCLs showed consistent imprinting in fewer loci than primary 
cells. Random monoallelic expression has previously been observed in LCLs and we show that random monoallelic 
expression in LCLs can be partly explained by aberrant methylation in the genome. 

Conclusions: Our results indicate that the widespread parent-of-origin-dependent expression recently observed in
rodents is unlikely to be captured by assessing human cells derived from adult tissues, where genome-wide
assessment of both primary and immortalized cells yields few new imprinted loci.



Background 

Most mammalian autosomal genes are thought to be expressed co-dominantly from the two
parental chromosomes. At some loci, the allele inherited from one parent is suppressed
through epigenetic mechanisms. This monoallelic expression, referred to as imprinting,
leads to genetic vulnerability that can contribute to rare monogenic syndromes, such as
Angelman and Prader-Willi syndromes [1]. Recent evidence suggests that common diseases,
such as basal-cell carcinoma and type 2 diabetes, can also be impacted by
parent-of-origin-specific allelic variants [2]. Classical imprinting of a region is the
result of expression of only one parental allele, where the other allele is completely
suppressed. However, a more subtle imprinting effect has recently been reported in which
both alleles are expressed, but at different levels and in a parent-of-origin-dependent
manner. This deviation from typical imprinting is called partial imprinting [3].

Although there is no global explanation for the role of imprinting in mammalian
development and physiology, a theory of parental conflict over the distribution of
resources to offspring has been proposed [4] and reviewed [5]. When maternal and paternal
input in the offspring is unequal, a differing evolutionary pressure is placed on the
alleles inherited from one or the other parent, where the maternally derived allele acts
to decrease maternal contribution to the fetus and the paternally derived allele acts to
increase maternal contribution [4]. Imprinted genes have been shown to be very important
in fetal, placental and brain development, postnatal growth, behavior and metabolism [6].
However, since not all imprinted genes are involved in development or growth, imprinting
has likely evolved more than once [7].

The debate around theories of imprinting parallels the 
intense investigation of the mechanisms that maintain 
imprinting. Monoallelic expression can be achieved with 
mechanisms such as CpG island methylation, histone 
modifications, antisense transcript-associated silencing, 
as well as by long-range chromatin effects [8]. However, 
such allele-specific phenomena are not restricted to 
imprinted genes [9] and not all of these mechanisms 
can be found in every imprinted locus. Because of this, 
studies looking at individual attributes of chromatin 
structure without correlation to gene expression may 
not be efficient in uncovering imprinted genes [10]. 

Although there are several genomic parameters that 
seem to distinguish imprinted and non-imprinted genes 
(smaller introns, repeat sequences), which have been 
exploited in attempts to bioinformatically predict mam- 
malian imprinted genes [11,12], these characteristics are 
not found in all imprinted genes. A feature of these pre- 
dictions is the generation of a large number of poten- 
tially imprinted genes; for example, one study predicted 
600 imprinted genes [13] while another predicted that 
there may be over 2,000 imprinted genes [14]. Yet, few 
of these bioinformatic predictions have been validated 
[15], leading many to believe that the numbers are lar- 
gely inflated and that the number of imprinted genes 
yet to be identified is small [9]. More conservative esti- 
mates assume 100 to 200 imprinted genes in the human 
genome [16]. 

So far, direct observation of mammalian imprinting in living cells and tissues has been
carried out most thoroughly in the mouse genome using RNA-seq [17,18]. These studies
employed the gold standard for recognizing imprinting in mice, the non-equivalence of
monoallelic expression in reciprocal matings of inbred strains, but yielded widely
different estimates of the number of imprinted genes in mouse embryonic brain. Using
three brain regions, up to 1,300 transcripts were reported as imprinted [18], whereas a
study of 5,000 genes in a single brain region observed only a handful of novel imprinted
genes beyond the more than 100 validated earlier [17]. Criteria for calling imprinting
that allowed for partial and inconsistent parent-of-origin-dependent expression within
transcripts and between individuals, along with demonstrated tissue specificity [18], may
in part explain the substantial discrepancy between the two studies. The reciprocal
mating approach used with mice cannot be used with humans. Consequently, demonstration of
imprinting requires family-based tissue samples as well as accurate methods to observe
differential expression of parental alleles. An obvious limitation of human studies is
access to multiple tissue types in which transmission patterns can be determined. This
has led to some genes being reported as imprinted without clear demonstration of allelic
expression (AE) bias [19] and/or parental bias [20-22]. Because of these limitations, the
extent of imprinting in humans is unclear. Currently, direct assessment of imprinting in
human tissues has yielded approximately 80 genes with varying degrees of evidence for
imprinting [23], and an up-to-date catalogue is kept at the Catalogue of Parent of Origin
Effects [24]. Some of the imprinted genes have been found to be tissue- or developmental
stage-specific [7]. Given the limitations in sampling as well as in measuring
differential expression of parental alleles comprehensively, it is commonly assumed that
the number could be significantly higher.

In addition to imprinting, random monoallelic expression (RME) has been reported as a
source of sequence-independent AE [25]. When RME occurs at a given locus, a range of
expression can follow such that some cells express only the maternal allele, some cells
express only the paternal allele and some cells express a combination of the two. This
class of genes has previously been reported to include the odorant receptor genes as well
as genes encoding immunoglobulins, T-cell receptors, interleukins, and natural killer
cell receptors [26-30]. Historically, RME was linked to a subset of genes involved in the
immune or nervous system. However, Gimelbrant et al. [25] assessed 3,939 genes in
multiple clonal lymphoblast cell lines (LCLs), found that roughly 10% were monoallelically
expressed and observed a large diversity in RME genes. In their study, different cell
clones derived from the same individual showed biallelic behavior at most loci. Other
studies have established links between allele-specific DNA methylation and RME [31]. In
an earlier study of ours, we observed an excess of high-magnitude AE in immortalized
lymphoblasts (LCLs) compared to primary cells (osteoblasts and fibroblasts), and this
correlated with the estimated levels of clonality [32]. It has been hypothesized that
aberrant methylation induced by lymphoblast immortalization, prolonged cell culture or
multiple passages may be a possible mechanism for the observed AE [33].

In this study, we utilize a genome-wide method [32] to determine strongly biased AE in
the transcriptome using family-based cell panels from two cell types (lymphoblasts and
primary fibroblasts). Using this method, we aim to uncover imprinting in the human genome
by determining parent-of-origin transmission in multiple pedigrees as well as excluding
heritable variants that cause monoallelic expression through population-based data
obtained from these same samples. To globally assess the relationship between methylation
and RME, we perturbed the methylation state in lymphoblasts using 5-azadeoxycytidine
(AZA), a drug that causes hemi-demethylation, and monitored changes in AE upon
demethylation. The density of measurements and the inclusion of family- and
population-based AE from two cell types, along with an investigation of the impact of
methylation on differential AE, provide the most comprehensive survey of epigenetic
cis-regulatory variation in the human genome to date.

Results 

Validated imprinting in lymphoblast cell lines and 
fibroblasts 

First, we assessed the level of evidence for non-overlapping genes suggested to be
imprinted (Catalogue of Parent of Origin Effects [24]), specifically looking for
demonstration of monoallelic expression with parent-of-origin-specific transmission in at
least one pedigree. For genes with consistent parent-of-origin transmission, our search
yielded a total of 44 imprinted genes. We were able to assess 73% of the confirmed
imprinted genes (32 of 44) in either lymphoblasts or fibroblasts (Table 1; Table S1 in
Additional file 1), as 12 loci were uninformative in our analysis (Table S2 in Additional
file 1). The degree of allelic bias was extracted from the Illumina 1M AE assay [GEO:
GSE26286] essentially as previously described [32].

To validate the allelic expression calls from the Illumina 1M assay, we tested 15 SNPs
from putative imprinted loci in 63 samples using a normalized Sanger sequencing-based
validation assay [34]. One SNP gave discrepant genotyping calls and was excluded from the
analysis, leaving 14 SNPs and 61 samples for comparison (Table S3 in Additional file 1).
The analysis shows a concordant expression bias towards the expected allele in all cases,
with a Pearson correlation coefficient of r = 0.9657 (Additional file 2).
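
As an illustration of the comparison described above, the following minimal sketch computes a Pearson correlation coefficient between array-based and sequencing-based allelic ratios and checks that both assays are biased towards the same allele. The function name `pearson_r` and all input values are hypothetical and chosen only for illustration; they are not the measurements reported in Additional file 2.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical allelic ratios (fraction of expression from one allele)
# for the same heterozygous samples measured by the two assays.
array_ratio = [0.92, 0.10, 0.85, 0.55, 0.15, 0.95]
sanger_ratio = [0.90, 0.12, 0.80, 0.57, 0.20, 0.97]

r = pearson_r(array_ratio, sanger_ratio)

# Concordant bias: both assays fall on the same side of 0.5 (equal expression).
same_direction = all((a - 0.5) * (s - 0.5) > 0
                     for a, s in zip(array_ratio, sanger_ratio))
print(f"r = {r:.4f}, concordant direction in all samples: {same_direction}")
```

A high r alone does not show that the two assays agree on which allele is overexpressed, which is why the direction check is kept separate in this sketch.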

The parent-of-origin-dependent transmission of allelic biases was confirmed in
lymphoblasts using a three-generation pedigree of Caucasian origin (CEPH family 1420)
[32] along with newly generated AE profiles in a Caucasian as well as a Yoruban
parent-offspring trio. We also used nine independent parent-offspring fibroblast trios to
confirm parental influence in AE. Of the known imprinted genes that were assessed, 37.5%
(12 of 32) showed monoallelic expression and clear parental bias either in both tissues
or in only one tissue if the other could not be assessed (Figure 1a and Table 1). Seven
of these have been previously validated in LCLs by independent PCR-based AE measurements
in a second pedigree (CEPH family 1444) [32]. An additional 22% (7 of 32) showed
predominantly biallelic expression (average fold-difference between alleles < 2-fold) in
one tissue with large magnitude AE and clear parental bias in the other tissue (Figure 1b
and Table 1). For these 19 imprinted genes, the average increased expression of the
overexpressed allele was 7.39-fold (2.94 to 11.84, ±1


Table 1 Validated imprinted genes in the human genome

Location  Gene         Transcript    Human  Mouse  Expressed allele  LCL  FB
6q24      PLAGL1 a     NM_001080952                P                 No   Yes
7q21      SGCE         NM_001099401  I      I      P                 Yes  No
7q21      PEG10        NM_015068     I      I      P                 NA   Yes
7q32      CPA4         NM_016352     I      NR     M                 No   Yes
7q32      MEST         NM_177524     I      I      P                 No   Yes
7q32      COPG2        NM_012133     CD     I      P                 No   Yes
7q32      KLF14        NM_138693                   M                 NA   Yes
11p15     H19          NR_002196                   M                 No   Yes
11p15     KCNQ1        NM_000218     I      I      M                 Yes  NA
14q32     MEG3         NR_002766     I      I      M                 No   Yes
15q11     MKRN3        NM_005664     I      I      P                 NA   Yes
15q11     MAGEL2       NM_019066     I      I      P                 NA   Yes
15q11     NDN          NM_002487     I      I      P                 NA   Yes
15q11     SNURF        NM_005678     I      I      P                 Yes  Yes
15q11     IPW          NR_023915                   P                 Yes  Yes
16p13     ZNF597       NM_152457     I      NR     M                 Yes  Yes
19q13     ZNF331       NM_001079906  I      NR     P                 Yes  Yes
19q13     ZIM2         NM_015363     I      I      P                 No   Yes
20q13     GNAS/GNASAS  NR_002785     I      I      M                 Yes  Yes
20q13     L3MBTL       NM_032107     I      NR     P                 Yes  Yes

a Only PLAGL1 isoform 1 is found expressed and imprinted in the fibroblasts; isoforms 1 and 2 are biallelically expressed in the LCLs. CD, conflicting evidence as
defined by Morrison et al. [19]; FB, fibroblast cell lines; I, imprinted genes with previously observed parent-of-origin-dependent expression bias; LCL, lymphoblast
cell lines; M, maternal; NA, not available (not expressed or non-informative in children); NR, not reported; P, paternal.
Figure 1 Examples of imprinted genes in the human genome

(a) Imprinted genes in both lymphoblasts and fibroblasts: GNAS is
an example of an imprinted gene that has been previously
described in the literature and is confirmed in our study.
(b) Imprinted genes in fibroblasts only: PLAGL1 is an example
of tissue-specific imprinting (isoform 1). (c) Novel imprinted genes:
ZDBF2 is an example of a novel imprinted gene. In each case, the
figure shows all of the informative pedigrees. For the trios, the
colors indicate the paternal allele (blue) and the maternal allele
(red). For the three-generation pedigree, the colors indicate which
parental allele is inherited. The bars indicate which allele is
overexpressed as well as the degree of overexpression.



standard deviation (SD)). The remaining genes (13 of 32; 41%) showed biallelic expression
in all available measurements (Table S1 in Additional file 1). Among the 32 assessed
imprinted genes, the AE observed for PRIM2, CPA4, and DLGAP2 in LCLs was associated with
genotypes at local SNPs, consistent with heritable rather than imprinted allelic
expression. Interestingly, the extreme AE observed for the CPA4 gene, although heritable
in LCLs, is consistent with imprinting in the fibroblasts.
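
The validation percentages quoted in this section and in the Abstract follow from the same counts with different denominators. The short sketch below, using only the counts stated in the text, reproduces the arithmetic; the variable names and the helper `share` are introduced here for illustration.

```python
# Counts taken from the text above.
confirmed_in_literature = 44   # genes with consistent parent-of-origin transmission
assessed = 32                  # informative in LCLs and/or fibroblasts
imprinted_consistent = 12      # imprinted in both tissues, or in the only assessable one
imprinted_one_tissue = 7       # biallelic in one tissue, imprinted in the other
biallelic = 13                 # biallelic in all available measurements

validated = imprinted_consistent + imprinted_one_tissue

def share(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole

# The three outcome groups partition the assessed genes.
assert imprinted_consistent + imprinted_one_tissue + biallelic == assessed

print(f"assessed:              {share(assessed, confirmed_in_literature):.1f}% of confirmed genes")
print(f"validated (assessed):  {share(validated, assessed):.1f}% of assessed genes")
print(f"validated (all):       {share(validated, confirmed_in_literature):.1f}% of confirmed genes")
```

Under this reading, the figure relative to all 44 confirmed genes corresponds to the value given in the Abstract, and the figure relative to the 32 assessed genes corresponds to the value given in the Discussion.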

Second, we looked for genes suggested to be imprinted (Catalogue of Parent of Origin
Effects [24]) but with inconsistent parent-of-origin transmission data in the literature.
Our search yielded 13 genes (marked 'PD/CD' in the tables), of which 69% (9 of 13) could
be assessed. Only the gene COPG2 was validated for imprinting in the fibroblasts
(Table 1), but it was found to be heritable in LCLs (data not shown). All of the
remaining eight genes were found to be biallelic in lymphoblasts and/or fibroblasts
(Table S1 in Additional file 1), and the AE observed for the genes ZNF215 and GABRG3 was
found to be heritable in both cell types (data not shown).

Novel imprinted genes and genomic regions 

Using AE patterns observed for validated imprinted genes, which showed at least a
2.9-fold difference in expression (-1 SD for confirmed imprinted genes), we sought
evidence for imprinting among annotated genes and unannotated transcripts. We required
that at least three consecutive SNPs showed an average deviation in excess of the
2.9-fold threshold and were measured in at least two children. Altogether, out of the
223,017 windows measured in at least two children, 1,253 fulfilled the criteria in the
three-generation LCL pedigree, and of the 234,837 windows measured in the fibroblasts, a
total of 549 showed high AE. These candidate windows fell into 254 distinct loci in LCLs
and into 110 loci in fibroblasts (Tables S5 and S6 in Additional file 3). Six of these
loci in LCLs (spanning 8 genes) and 15 loci in fibroblasts (spanning 19 genes) had
earlier literature evidence and were included in the assessment of known loci above. Our
analysis revealed five imprinted RefSeq annotated genes not reported by other methods in
humans (Table 2, Figure 1c). The genes ZDBF2 and SGK2 were found imprinted in LCLs, while
the genes NAT15, RTL1 and MEG8 were found imprinted in fibroblasts. Three of these novel
imprinted human genes had previously been identified in mice (ZDBF2, RTL1, MEG8) [35-37].
We note that in the fibroblasts, none of
Table 2 Novel imprinted genes found in lymphoblasts and/or fibroblasts

                                           LCL                                    FB
Location  Gene    Mouse  Expressed allele  Number of ITs  AE (average magnitude)  Number of ITs  AE (average magnitude)
2q33      ZDBF2   I      P                 9              12.06                   NA             NA
16p13     NAT15   NR     M                 NA             NA                      3              6.95
20q13     SGK2    NR     P                 9              8.9                     NA             NA
14q32     RTL1           P                 NA             NA                      <1             12.34
14q32     MEG8           M                 NA             NA                      8              10.66

AE, allelic expression; FB, fibroblast cell lines; I, imprinted genes with previously observed parent-of-origin-dependent expression bias; IT, informative transmission;
LCL, lymphoblast cell lines; M, maternal; NA, not available; NR, not reported; P, paternal.



the regions overlapping RefSeq annotation and demon- 
strating potentially parent-of-origin-based transmission 
showed positive population mapping data (n = 15) 
whereas 36% (4 out of 11) for LCLs showed links with 
common variants in mapping data (Tables S5 and S6 in 
Additional file 3). 
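
The window criteria described in this section (three consecutive SNPs, average deviation above 2.9-fold, measured in at least two children) can be sketched as a simple scan. This is only an illustration under stated assumptions: the input layout, the function names `candidate_windows` and `merge_into_loci`, the per-SNP reading of the two-children requirement, and the rule that overlapping windows are merged into one locus are all assumptions made here, and the example SNPs are hypothetical.

```python
from statistics import mean

FOLD_THRESHOLD = 2.9   # from the text: mean minus 1 SD of validated imprinted genes
WINDOW_SIZE = 3        # from the text: consecutive SNPs per window
MIN_CHILDREN = 2       # from the text: minimum number of children measured

def candidate_windows(snps):
    """Return (start, end, mean_fold) for every window passing the criteria.

    snps: list of (position, fold_difference, n_children) tuples sorted by
    position. This layout is an assumption made for the sketch.
    """
    hits = []
    for i in range(len(snps) - WINDOW_SIZE + 1):
        window = snps[i:i + WINDOW_SIZE]
        # Skip windows containing a SNP measured in too few children.
        if any(n < MIN_CHILDREN for _, _, n in window):
            continue
        avg = mean([fold for _, fold, _ in window])
        if avg > FOLD_THRESHOLD:
            hits.append((window[0][0], window[-1][0], avg))
    return hits

def merge_into_loci(windows):
    """Merge overlapping windows into loci (merging rule assumed, not from the text)."""
    loci = []
    for start, end, _ in sorted(windows):
        if loci and start <= loci[-1][1]:
            loci[-1] = (loci[-1][0], max(loci[-1][1], end))
        else:
            loci.append((start, end))
    return loci

# Hypothetical SNPs for illustration only.
example = [(100, 1.0, 3), (200, 3.5, 2), (300, 4.0, 2),
           (400, 3.1, 3), (500, 3.3, 2), (600, 5.0, 1)]
windows = candidate_windows(example)
loci = merge_into_loci(windows)
print(windows)
print(loci)
```

Merging is included because the text reports far fewer loci than windows (for example, 1,253 windows falling into 254 loci in LCLs), which implies that neighbouring windows were grouped in some way.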

Since transcription was measured across the genome, we 
were able to observe potentially imprinted expression of 
ten unannotated intergenic regions (Table 3; Additional 
file 4). Four of these ten regions showed strong evidence 
for imprinting while the remaining six were found to be 
consistent with heritable AE. In some cases (n = 3) the
imprinted regions spanned two to three genes and measured
between 73,150 and 1,569,064 bases (Figure 2).
We also commonly encountered imprinted transcription 
of SNPs outside the boundaries of annotated imprinted 
genes. For example, 10 of the 20 RefSeq genes showing 
strong evidence of imprinting continued this strong 
imprinted expression outside of the annotated gene 
boundary. Surprisingly, seven of these ten cases showed 
imprinted expression 5 kb away from the transcript, 
suggesting that they may represent independent tran- 
scriptional units or unannotated isoforms of the 
imprinted genes. 

Partial imprinting 

We have previously shown that immortalized LCLs demonstrate an excess of monoallelic
expression, putatively due to rare RME events detectable in these lines [32]. To avoid
such biases, we looked for moderate magnitude AE (2- to 2.9-fold average difference among
all informative heterozygotes) in loci where at least two of the children of the nine
fibroblast trios were heterozygous to uncover partial imprinting. To avoid redundancy, we
excluded AE at boundaries of classically imprinted regions (as defined in the above
sections). Out of the 234,837 windows measured, we identified 46 loci that showed this
degree of allelic bias. Of these, 30 could be determined to be consistent with heritable
AE, mappable to local polymorphisms; in 80% of cases (24 of 30) the mapped polymorphism
was transmitted in a Mendelian fashion (the remaining 6 were not informative for
transmission of the putative regulatory variant). The remaining 16 RefSeq genes did not
show association with common SNPs and were further investigated for change of the
relatively overexpressed haplotype with transmission (indicative of a non-genetic effect)
and parental bias in pedigrees. Four of the 16 showed strong evidence for partial
imprinting, with the father's allele being preferentially expressed (TRAPPC9, ADAM23,
CHD7, TTPA; Additional file 4).
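
The triage applied to moderate-magnitude loci can be summarised as a small decision function. The sketch below is an illustration only: the function name `triage_locus`, the category labels, and the treatment of the range boundaries (2.0 inclusive, 2.9 exclusive) are assumptions made here rather than details given in the text.

```python
MODERATE_MIN = 2.0   # lower bound of moderate AE (from the text)
MODERATE_MAX = 2.9   # threshold used for classical imprinting (from the text)

def triage_locus(mean_fold, maps_to_local_snp, consistent_parental_bias):
    """Classify a locus by average fold difference and supporting evidence.

    Boundary handling is an assumption of this sketch.
    """
    if not (MODERATE_MIN <= mean_fold < MODERATE_MAX):
        return "outside moderate range"
    if maps_to_local_snp:
        return "heritable AE"
    if consistent_parental_bias:
        return "candidate partial imprinting"
    return "unexplained moderate AE"

# Counts reported in the text for the fibroblast trios.
moderate_loci = 46
heritable = 30
not_mapped = moderate_loci - heritable      # loci investigated further
partial_imprinting = 4
mendelian_fraction = 24 / heritable         # mapped variants with Mendelian transmission

print(triage_locus(2.4, maps_to_local_snp=False, consistent_parental_bias=True))
print(not_mapped, partial_imprinting, f"{mendelian_fraction:.0%}")
```

Checking the mapping to local SNPs before parental bias mirrors the order described above, in which heritable variation is excluded first.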

Mechanisms for random allelic expression 

In order to assess the basis of extreme non-imprinted, 
non-heritable AE observed in lymphoblasts, three LCLs 
were treated with the demethylating agent AZA and 



Table 3 Novel candidate imprinted intergenic regions in lymphoblasts and fibroblasts

Chromosome  Start      End        LCL AE (average magnitude)  FB AE (average magnitude)  Heritable AE
1           210509341  210524037  4.23                        NI                         Yes
1           72584492   72610078   NI                          2.94                       Yes
2           187422507  187893532  NI                          2.68                       No
7           26113744   26137739   4.36                        NI                         Yes
12          9514883    9649634    4.24                        NI                         Yes
14 a        100425763  100608884  NI                          8.58                       No
15 b        22786809   22902119   10.18                       7.55                       No
16          54019260   54035547   8.54                        NI                         Yes
16          3355563    3366918    NI                          2.78                       No
17          41604896   41620711   NI                          5.87                       Yes

a Downstream of MEG3 and RTL1. b Within SNRPN/SNURF region. AE, allelic expression; FB, fibroblast; LCL, lymphoblast cell line; NI, not informative.
Figure 2 Examples of imprinted genomic regions in fibroblasts. (a) Paternally expressed imprinted region on chr14 covering numerous
non-RefSeq genes found downstream of the paternally imprinted DLK1 gene (which was not informative in our samples). This region has been
previously identified in mice and sheep. (b) Extension of imprinting with paternal expression downstream of the SNRPN/SNURF loci
encompassing multiple non-RefSeq genes. (c) Maternally expressed imprinted gene ZNF597 with upstream imprinted isoform-specific NAT15.
were observed for changes in AE upon treatment. The three cell lines were selected based
on our earlier data indicating high levels of clonality in these particular cell lines
[32], based on extreme deviation from random X-inactivation. Using 5 µM AZA for 3 days,
we observed a significant decrease in AE (defined as an allelic change of at least
1.25-fold, the 95th percentile of allelic fold change among untreated biological
controls) in 20% of loci that showed at least a two-fold difference in AE at baseline.
Only one of the imprinted loci showed a change in AE upon treatment (GNAS). Similarly,
loci where the AE could be mapped to common SNPs [32] were underrepresented: 23% (7 of
30) of AE traits affected by treatment mapped to SNPs (Table 4), whereas 35% (17 of 48)
of loci without significant treatment effect on AE showed association with local SNPs
(Table 5). These observations suggest that demethylation alters the expression of
randomly silenced genes in lymphoblasts. We studied this further by observing concordance
of AE for identical-by-descent (IBD) siblings in a three-generation pedigree (CEPH 1420).
We reasoned that if demethylation primarily affects random allelic silencing, then loci
demonstrating treatment-specific effects would also more likely show random or
IBD-independent AE, since heritable or imprinted loci should demonstrate consistent AE.
IBD siblings were considered concordant for AE if both had the same allele overexpressed
and showed over 1.5-fold difference between alleles. They were considered discordant if
one sibling showed 1.5-fold overexpression and the other sibling was either biallelic or
overexpressed the other allele. The IBD sibling analysis showed discordant AE in 30% of
transmissions (14 of 46) for loci affected by treatment but only in 9% (7 of 74) for loci
not altered by treatment (P-value = 0.00308; Table 6). This suggests that RME, which is
detectable in lymphoblasts due to their reduced mosaicism [32], may be partly explained
by aberrant methylation in the genome and that this effect can be partially reversed by
demethylation treatment. To confirm these results, an independent cell line was treated
with 10 µM of AZA for 5 and 10 days. At the 10-day time-point, 61 of 155 allelically
expressed loci (more than a two-fold difference in untreated) showed a 50% decrease in
magnitude of AE upon treatment and no loci showed an



Table 4 Genes affected by AZA treatment

                                                      19099                     19141
Gene       Transcript    Location                     Untreated    Treated      Untreated    Treated      IBD  Mapped to polymorphism
PCTK3      NM_212502     chr1:203742262-203768466     2.54         1.42         2.74         1.24         Yes
CR1L       NM_175710     chr1:205886352-205961039     2.21         1.38         2.20         1.49         Yes
KCNK1      NM_002245     chr1:231822688-231871795     1.78         1.11         4.69         2.59         NI   Yes
                         chr4:79778447-79803457       2.47         1.78         2.1          1.27         NA   Yes
                         chr5:173100613-173139917     2.33         1.27         2.08         1.15         Yes
                         chr5:9599989-9600708         3.02         1.58         1.47         1.06         Yes
                         chr6:139658229-139733915     3.16         2.36         3.53         2.53         Yes  Yes
                         chr6:80016628-80042343       3.16         2.36         2.71         1.25         NA
CALN1      NM_001017440  chr7:71159735-71207121       1.51         1.05         5.76         1.76         NA
                         chr11:3036678-3063235        4.64         2.94         4.63         3.18         Yes  Yes
SYT9       NM_175733     chr11:7376868-7440901        2.09         1.31         3.48         2.15         Yes  Yes
VWA5A      NM_001130142  chr11:123521934-123522703    2.02         1.37         2.12         1.37         Yes
P2RX7      NM_002562     chr12:120055848-120087505    2.23         1.08         1.81         1.24         Yes
COL4A2     NM_001846     chr13:109958305-109963202    3.25         1.5          2.58         1.84         Yes
PRKCH      NM_006255     chr14:60959560-61030659      2.17         1.64         1.92         1.43         Yes
DNAJA4     NM_00130182   chr15:76345564-76360674      2.15         1.58         8.56         5.77         Yes
                         chr15:94684325-94711444      3.11         2.13         5.45         3.28         Yes
GAS7       NM_001130831  chr17:10022884-10022981      4.24         2.21         2.13         [illegible]  Yes  Yes
                         chr17:14190861-14192673      2.37         1.68         2.34         1.60         NI   Yes
C20orf194  NM_001009984  chr20:3179134-3334482        3.47         2.08         1.92         1.45         Yes
                         chr20:46050433-46119516      2.22         1.54         2.48         1.65         NA
                         chr20:46358273-46404570      2.06         1.27         3.15         1.59         NA   Yes
GNAS       NR_002875     chr20:56848505-56882141      6.46         4.14         3.62         2.06         Yes
TMPRSS3    NM_024022     chr21:42665938-42688945      1.47         1.04         4.19         1.43         Yes
                         chr22:28762403-28805154      1.43         1.03         4.25         2.23         Yes
OSBP2      NM_030758     chr22:29598129-29633708      2.41         1.84         2.86         1.71         NI

IBD, identical-by-descent; NA, not available; NI, not informative.
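
The treatment-response criterion (at least a two-fold AE at baseline and an allelic change of at least 1.25-fold) can be expressed as a small function. This is a sketch under assumptions: the text does not spell out how the allelic change is computed, so it is taken here as the ratio of the untreated to the treated fold difference, and the function name `aza_response` and its labels are hypothetical. The first two calls use values from Table 4; the third uses a made-up pair.

```python
BASELINE_MIN_FOLD = 2.0   # from the text: minimum AE at baseline
MIN_CHANGE = 1.25         # from the text: 95th percentile among untreated controls

def aza_response(untreated_fold, treated_fold):
    """Label the change in AE between untreated and treated cells.

    The change is computed as untreated/treated, which is an assumption
    of this sketch rather than a definition given in the text.
    """
    if untreated_fold < BASELINE_MIN_FOLD:
        return "below baseline threshold"
    ratio = untreated_fold / treated_fold
    if ratio >= MIN_CHANGE:
        return "decreased"
    if 1.0 / ratio >= MIN_CHANGE:
        return "increased"
    return "unchanged"

print(aza_response(2.54, 1.42))   # PCTK3, first cell line in Table 4
print(aza_response(6.46, 4.14))   # GNAS, first cell line in Table 4
print(aza_response(3.00, 2.90))   # hypothetical locus with little change
```
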
Table 5 Genes not affected by treatment

                                                      19099                     19141
Gene       Transcript    Location                     Untreated    Treated      Untreated    Treated      IBD  Mapped to polymorphism
MARK1      NM_018650     chr1:218867493-218900613                               6.38         6.26         Yes
DISC1      NM_001012957  chr1:229837731-230086433                               4.64         4.85         Yes  Yes
CYP27A1    NM_000784     chr2:219372907-219379842     3.84         2.54         1.25         1.62         NI
THNSL2     NM_018271     chr2:88252911-88265923       2.43         3.37                                   Yes  Yes
PTPRG      NM_002841     chr3:62165281-62250653       4.02         3.09         3.65         5.10         NI
UPK1B      NM_006952     chr3:120375223-120399317     4.12         3.68         1.23         1.33         NI
FAM53A     NM_001013622  chr4:1654935-1655009         1.93         2.04         2.30         2.13         Yes
EVC        NM_153717     chr4:5767823-5801057         1.68         1.78         3.47         3.11         Yes
                         chr4:6698225-6722860                                   3.9          3.99         Yes  Yes
                         chr4:107011829-107032181                               4.34         4.27         N
                         chr4:142529192-142768065     5.11         5.47                                   Yes  Yes
ANKH       NM_054027     chr5:14801236-14922709       1.42         1.35         7.59         8.63         Yes  Yes
                         chr5:82347320-82386566                                 2.1          2.23         Yes  Yes
                         chr6:654765-656792           3.18         3.13                                   NA
MOXD1      NM_015529     chr6:132659162-132759924     2.02         2.08         3.07         3.03         Yes  Yes
                         chr8:511306-580861           3.54         2.76         2.33         2.79         Yes  Yes
                         chr9:5296824-5301171                                   3.83         3.91         Yes
DEC1       NM_017418     chr9:117025707-117204395     2.22         2.18                                   Yes  Yes
DIP2C      NM_014974     chr10:363048-477973          3.56         2.48         1.25         1.73         Yes
FRMD4A     NM_018027     chr10:13817200-14106528                                3.93         4.13         NA
                         chr11:6879025-6898447        2.81         2.20         1.63         2.15         Yes  Yes
                         chr11:70187425-70240934      2.73         2.13         1.52         1.83         N
                         chr13:18766583-18804422      3.50         3.03         3.52         4.30         Yes  Yes
WDR51B     NM_172240     chr12:88415605-88431297      2.18         2.27         2.66         2.62         Yes  Yes
                         chr14:24047434-24096337      2.44         2.01         4.27         5.26         N
PAX9       NM_006194     chr14:36198687-36216226      1.98         1.51         2.06         2.63         Yes
                         chr14:69730226-69746414                                5.76         5.81         NA
DPF3       NM_012074     chr14:72343297-72429399      1.83         2.39         3.30         2.29         Yes
WARS       NM_173701     chr14:99906106-99911812      2.49         2.41                                   Yes  Yes
                         chr15:22775434-22933834      11.24        9.95         7.06         8.26         Yes
                         chr15:28921438-28971039      11.73        7.79         2.13         2.70         Yes  Yes
SV2B       NM_014848     chr15:89614733-89637888      2.88         2.74                                   NI
                         chr16:53974720-54069307      5.38         2.88         1.10         1.64         Yes  Yes
                         chr16:83152950-83155553                                3.52         3.6          NA
SLC13A5    [illegible]   [illegible]                                            [illegible]  3.89         NA
                         chr17:34566422-34580691      2.65         2.28         1.56         1.78         NA
PITPNC1    NM_181671     chr17:63031387-63046267      3.08         2.46         1.59         1.83         Yes
DSC3       NM_024423     chr18:26824546-26875293      1.88         2.21         4.83         3.67         Yes  Yes
KATNAL2    NM_031303     chr18:42780796-42812910      1.97         1.83         2.41         2.60         NI
                         chr19:40008284-40033757      6.05         5.53         4.29         4.95         NI
SIGLEC5    NM_003830     chr19:56807457-56823545      1.45         1.24         3.84         5.51         NI
                         chr19:58776466-58798723      3.02         3.32                                   Yes
                         chr22:22567862-22619365      1.12         1.03         8.09         10.02        Yes  Yes
LDOC1L     NM_032287     chr22:43268050-43270537      1.46         1.75         3.74         2.97         Yes

IBD, identical-by-descent; NA, not available; NI, not informative.



Table 6 Allelic expression observed in identical-by-descent siblings

Condition              Number of loci  Concordant AE in independent IBD pairs  Discordant AE in independent IBD pairs
AE altered by AZA      26              32                                      14
AE not altered by AZA  48              67                                      7

AE, allelic expression; AZA, 5-azadeoxycytidine; IBD, identical-by-descent.
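
The concordance rule for IBD sibling pairs and the comparison of the two rows of Table 6 can be sketched as follows. Two points are assumptions of this sketch: pairs in which neither sibling exceeds 1.5-fold are labelled "uninformative", and the 2x2 comparison uses a Pearson chi-square test without continuity correction. The text does not state which test produced the reported P-value, so the value computed here need not match it exactly. The function names are hypothetical.

```python
import math

AE_THRESHOLD = 1.5   # from the text: fold difference defining overexpression

def ibd_pair_status(allele_1, fold_1, allele_2, fold_2):
    """Classify an IBD sibling pair as concordant or discordant for AE."""
    biased_1 = fold_1 > AE_THRESHOLD
    biased_2 = fold_2 > AE_THRESHOLD
    if biased_1 and biased_2 and allele_1 == allele_2:
        return "concordant"
    if biased_1 or biased_2:
        # One sibling is biased; the other is biallelic or favours the other allele.
        return "discordant"
    return "uninformative"   # assumed label; this case is not described in the text

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and P-value (1 degree of freedom) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Pair counts from Table 6: (concordant, discordant).
altered = (32, 14)
not_altered = (67, 7)

frac_altered = altered[1] / sum(altered)
frac_not_altered = not_altered[1] / sum(not_altered)
chi2, p_value = chi_square_2x2(*altered, *not_altered)

print(f"discordant: {frac_altered:.1%} (altered) vs {frac_not_altered:.1%} (not altered)")
print(f"chi-square = {chi2:.2f}, P = {p_value:.4f}")
```
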
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opposite effect (that is, there was a 50% increase in AE 
upon treatment). Of the loci strongly affected by the
treatment, 95% (58 of 61) showed a consistent time
dependency (at 5 days the magnitude of the change
in AE was less marked). The directionality and
time dependence of the treatment suggest that changes 
in AE were specific to AZA treatment. To further verify 
that demethylation was occurring, we incubated frag- 
mented DNA with His-MBD2b, a methyl binding pro- 
tein that has a high affinity for CpG methylated DNA. 
We then removed the non-tagged DNA, leaving only 
methylated fragments. Comparing the signal intensities 
(XY raw signals from 1M Illumina BeadChip) in DNA 
between the treated and untreated samples after the 
methyl binding protein affinity assay shows that, for 
sites where XY raw signal significantly differs (> 1 SD 
difference) between treated and untreated samples, the 
direction of effect is predominantly towards a decrease 
of signal intensities in treated cells, suggesting that AZA 
treatment did in fact reduce global methylation in LCLs. 

Discussion 

Our work demonstrates that many allelic expression
events previously suggested to be caused by imprinting
fail to validate in two human cell types: we detected
59% of imprinted genes with stronger a priori evidence
of parental expression bias but only 8% of imprinted
genes with conflicting evidence of parental expression
bias. These numbers suggest that caution is needed when
experimentally assessing imprinting
in the human genome. We note that while the tran- 
scriptome coverage is high (approximately 50% of 
RefSeq genes per tissue) using our methods, a limitation 
to the allelic expression mapping using primary tran- 
scripts is non-strand specificity; therefore, if antisense 
imprinting or imprinting of intragenic transcripts is 
common, we would underestimate the prevalence of 
imprinting. On the other hand, assessment of rarely
analyzed unannotated regions revealed few additional
targets with potential imprinting. In addition to
unannotated regions, our study included five-fold higher 
coverage for annotated genes than a previous allele-spe- 
cific expression study [9] carried out in cells of lym- 
phoid origin. Consequently, the coverage for validated 
imprinted genes was over five-fold higher for the LCLs 
in our study. Pollard et al. [9] assayed AE in 2,625 genes 
and only three of these were previously known to be 
imprinted. 

In summary, we validated 20 genes out of the 41 
genes we were able to assess for imprinting. Six genes 
were found imprinted in both LCLs and fibroblasts 
(SNURF, IPW, ZNF597, ZNF331, GNAS/GNASAS and 
L3MBTL). Most of the validated genes were found to be 
tissue-specific: SGCE and KCNQ1 were imprinted only 



in the LCLs while the other genes were imprinted only 
in the fibroblasts. Interestingly, 90% of the previously 
identified imprinted genes (18 of 20) validated in this 
study were imprinted in the primary fibroblasts as 
opposed to only 40% for the immortalized LCLs (8 of 
20). For five of these genes we also found that the AE 
observed in the LCLs is mediated by heritable rather 
than epigenetic mechanisms (PRIM2, CPA4, DLGAP2,
ZNF215 and GABRG3). Given that the AE of CPA4 is
heritable in LCLs whereas the gene is imprinted in
fibroblasts, further study of the two cell lines could
help identify some of the factors involved in the
mechanism of imprinting. Interestingly, another study found that
CPA4 was imprinted in many fetal tissues but not in the 
fetal brain using pyrosequencing [38]. 

Several of the genes that were previously reported as 
imprinted (with consistent parent-of-origin transmission) 
were not confirmed in our study. In line with the litera- 
ture, many of these are thought to be tissue-specific. For 
example, the gene KCNK9 is clearly imprinted but it is 
only highly expressed in the central nervous system and 
the cerebellum [39] and, as expected, shows no
imprinting in LCLs and fibroblasts. The same is true of
PHLDA2 and OSBPL5, which are imprinted in the placenta
[40,41], and of UBE3A and GRB10, which are imprinted in
the brain [42,43]. Given that we validated 59% of the
genes previously reported with consistent
parent-of-origin transmission, compared with only 8% of
those reported with inconsistent parent-of-origin
transmission, genes in the latter category are more
likely to be false positives.

Our data show conclusive evidence of imprinting for a 
few additional RefSeq genes (NAT15 and SGK2) as well 
as for three genes previously found imprinted in mice 
but not validated in humans (ZDBF2, RTL1 and MEG8) 
(Table 2). The NAT15 and SGK2 genes both lie adjacent 
to previously confirmed imprinted genes: ZNF597 and 
L3MBTL, respectively. 

Our genome-wide analysis of unannotated regions 
revealed evidence of imprinting for four additional 
regions (Figure 2), all of which were identified in the 
fibroblasts. Three of these regions span multiple genes. 
In addition, we discovered four new genes with moder- 
ate imprinting (TRAPPC9, ADAM23, CHD7 and 
TTPA), all of which showed paternal expression. The 
observation of partial imprinting for TRAPPC9 is nota- 
ble and should be studied in brain since this gene has 
recently been shown to be mutated in autosomal 
recessive mental retardation [44-46]. Consequently, if 
imprinting or partial imprinting can be replicated in 
human brain, paternally transmitted loss-of-function 
mutations could be enriched among individuals with 
intellectual disability. 
This is the first genome-wide survey of imprinting
using human primary cells. The use of human fibro- 
blasts to uncover new imprinted genes and regions and 
to validate known imprinted genes was more efficient 
than the use of LCLs. Putatively, the epigenetic altera- 
tions upon immortalization and prolonged cell culture 
observed earlier [47] in LCLs can disrupt imprinted 
gene expression. To further study the true extent of 
imprinting, tissue-dependent expression of primary cells 
retrievable from blood (distinct cellular lineages com- 
pared to fibroblasts) should be pursued [48]. The overall 
coverage of suggested and established imprinted genes 
should represent adequate tissue sampling. We note 
that our ability to observe imprinting in approximately 
50% of known imprinted genes in the current study is 
not substantially lower than that reported by Gregg 
et al. [18] when studying multiple regions in developing 
mouse brain, where 47 of 72 known and measured
imprinted genes showed parent-of-origin-dependent 
expression. In contrast to this latter study and despite 
our high transcriptome coverage, we did not find wide- 
spread evidence of unknown classically imprinted genes 
or even partial imprinting in annotated or unannotated 
regions. One potential explanation for the difference in 
uncovering novel imprinted genes between our study 
and the study by Gregg et al. is that we required consis- 
tent parent-of-origin-dependent expression across a 
genomic region (three independent SNPs required) and 
most of the novel imprinting candidates observed in 
mice did not show consistent evidence across a tran- 
scriptional unit [18]. 

While the LCLs provide a less powerful cell system to 
study imprinting compared to primary fibroblasts, they 
offer the possibility to look for determinants of non- 
heritable allelic expression since the cells have reduced 
mosaicism and show an excess of extreme allelic expres- 
sion compared to primary cells [32]. Gimelbrant and 
colleagues [25] have shown in individually derived LCL 
clones that the extent of RME could be substantial, but 
the mechanisms involved in random allelic silencing 
have not been previously pursued on a genome-wide 
scale. Here we show directly that reversible methylation 
is one of the mechanisms involved in RME using a 
demethylating agent in two different sets of samples. 
We also suggest that the mechanisms underlying transi- 
ent methylation-mediated allelic silencing are not pri- 
marily involved in imprinting or heritable allelic 
expression since such loci were relatively underrepre- 
sented among loci showing allelic expression changes 
upon demethylation. 

Conclusions 

In our comprehensive genome-wide search for imprinting
and non-heritable allelic expression in humans, we



found relatively few new imprinted genes, at least in 
LCLs and fibroblasts. Our results also suggest that the 
false-positive rate among suggested imprinted genes 
without direct evidence of parent-of-origin expression is high. This
is likely, in part, due to the high prevalence of heritable 
allelic expression we observed in many candidate 
regions in our survey as well as technical issues in 
measuring allelic expression in human samples using 
single-point assessment. The existence of widespread 
parent-of-origin-dependent allelic expression observed 
recently in mouse studies [18] was not directly 
addressed in our assessment as we required multiple 
consistent measurements across transcripts. Overall, this 
could point to fewer than 100 classically imprinted genes
(accounting for some tissue specificity) in the human 
genome. To extend the human catalogue where imprint- 
ing is directly observed as we show here, we suggest that 
other primary cells retrievable by non-invasive means 
(allowing analyses in pedigrees) will likely be needed. 

Materials and methods 

Imprinted gene search 

Genes were selected from the imprinting catalogue 
maintained at the Catalogue of Parent of Origin Effects 
(University of Otago). Imprinted genes were categor- 
ized as having either consistent (44 genes selected) or 
inconsistent parent-of-origin transmission (13 genes 
selected). 

Samples and cell culture 

For the lymphoblast samples, a three-generation pedi- 
gree of Caucasian origin (CEPH family 1420) [32] along 
with newly generated AE profiles in a Caucasian (1463) 
as well as a Yoruban (Y117) parent-offspring trio were 
used. In addition, nine independent parent-offspring
fibroblast trios were used to confirm parental influence
on AE. Seven of the loci showing parent-of-origin
effects in LCLs had previously been validated by inde- 
pendent AE measurements in a second pedigree (1444) 
[32]. All LCLs were obtained from Coriell (Camden, NJ, 
USA) and fibroblast cell lines were also obtained from 
Coriell and the McGill Cellbank (Montreal, QC, 
Canada). Details of the cell lines used can be found in 
Table S4 in Additional file 1. This study was approved 
by the local ethics committee (McGill University IRB). 

The HapMap immortalized LCLs were grown in T75
flasks in 1X RPMI 1640 medium (Invitrogen, Burlington,
ON, Canada) with 2 mM L-glutamine, 15% fetal bovine
serum and 1% penicillin/streptomycin at 37°C with 5%
CO2. Primary fibroblast cell lines were grown in
α-MEM (SigmaAldrich, Oakville, ON, Canada)
supplemented with 2 mmol/l L-glutamine, 100 U/ml
penicillin, 100 µg/ml streptomycin, and 10% fetal bovine
serum (SigmaAldrich) at 37°C with 5% CO2.
At 70 to 80% confluence, the cells were harvested and
stored at -70°C until RNA and DNA extraction.

RNA and DNA extraction and cDNA synthesis 

Total RNA was extracted from cell lysates resuspended
in 600 µl RLT lysis buffer using the RNeasy Mini Kit
(Qiagen, Ontario, Canada). High RNA quality was
confirmed for all samples using the Agilent 2100
BioAnalyzer (Agilent Technologies, Mississauga, ON,
Canada) and the concentrations were determined using a
NanoDrop ND-1000 (NanoDrop Technologies, Wilmington,
DE, USA). The cDNA synthesis protocol was applied to
heteronuclear RNA, which allowed the measurement of
unspliced primary transcripts. Approximately 150 µg of
total RNA was isolated and treated with 6 U DNase I,
and poly(A) RNA was then enriched using the
MicroPoly(A)Purist protocol (Ambion Inc., Streetsville,
ON, Canada). First-strand cDNA synthesis was carried
out on 1 µg poly(A)-enriched RNA using random hexamers,
and second-strand cDNA synthesis was performed using
the SuperScript Double-Stranded cDNA Synthesis Kit
(Invitrogen). DNA was extracted from cell lysates
resuspended in 200 µl phosphate-buffered saline using
the GenElute DNA Miniprep Kit (SigmaAldrich).
Concentrations were determined using the Quant-iT
PicoGreen kit (Invitrogen).

Allelic expression analysis on Human1M or
Human1M-Duo BeadChips

Approximately 200 ng of genomic DNA and a 50 to
300 ng double-stranded cDNA sample were used for the
parallel genotyping and AE analysis on the Illumina
Infinium Human1M or Human1M-Duo SNP bead microarray
as previously described [32]. The parallel assessment
of gDNA and cDNA heterozygote ratios was carried out
essentially as described earlier [32], but signal intensity
normalization at heterozygous sites followed a slightly
modified approach. For the AE analysis, we utilized the
Xraw and Yraw signal intensities and, because the variance
in the two channels was not the same (that is, it is a
function of the total intensity from both channels), a
normalization of the variation was performed to allow
comparison between gDNA and cDNA allele ratios. In this
study, only the P ratio (Xraw/(Xraw + Yraw)) was
normalized, using heterozygous SNPs with a total intensity
(Xraw + Yraw) higher than the threshold value of 1,000.
The scatter plot of the P ratio against the log10-scaled
total intensity is fit well by a polynomial (quadratic)
regression model. This model shows a better fit than the
linear regression model that we employed earlier for
normalization [32], which works well at higher
intensities but poorly at lower intensities in many
samples. The normalization process can be briefly
summarized in the following steps: step 1, the P ratio
is calculated along with the total intensity in log10
scale for all heterozygous SNPs; step 2, all data points
with a total intensity greater than 1,000 are divided
into 50 intensity bins; step 3, a fitted curve from the
median P ratio in each bin is computed using a
polynomial regression model (quadratic regression),
y = b_1 x + b_2 x^2 + a, where y is the expected P ratio
from the curve and x is the log10-scaled total intensity;
step 4, from the fitted curve, the expected P ratio based
on total intensity is calculated; step 5, the final
normalized P ratio equals (Pobs - Pexpected + 0.5).
Following normalization, all median P ratio values in all
intensity bins should be close, if not equal, to 0.5.
Phasing of the genotypes in the trios was done using
Beagle [49] and in the three-generation pedigree using
Merlin [50].
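
The five normalization steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the function names are hypothetical, the 50 bins are taken to be equal-count bins (the text does not say whether bins are equal-count or equal-width), and each bin is represented by its median log10 intensity.

```python
import math
from statistics import median


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b1*x + b2*x^2; returns (a, b1, b2)."""
    # Build the 3x3 normal equations as an augmented matrix.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def normalize_p_ratio(x_raw, y_raw, min_total=1000.0, n_bins=50):
    """Return normalized P ratios for heterozygous SNPs above min_total.

    Output order follows input order, with low-intensity SNPs dropped.
    """
    # Step 1: P ratio and log10 total intensity (filtered at min_total).
    points = []
    for xr, yr in zip(x_raw, y_raw):
        total = xr + yr
        if total > min_total:
            points.append((math.log10(total), xr / total))
    # Step 2: split into intensity bins (assumption: equal-count bins).
    ordered = sorted(points)
    size = max(1, math.ceil(len(ordered) / n_bins))
    chunks = [ordered[i:i + size] for i in range(0, len(ordered), size)]
    if len(chunks) < 3:
        raise ValueError("need at least three bins for a quadratic fit")
    # Step 3: quadratic fit through the per-bin medians.
    bin_x = [median(x for x, _ in c) for c in chunks]
    bin_p = [median(p for _, p in c) for c in chunks]
    a, b1, b2 = fit_quadratic(bin_x, bin_p)
    # Steps 4 and 5: expected P from the curve, then re-center on 0.5.
    return [p - (a + b1 * x + b2 * x * x) + 0.5 for x, p in points]
```

Applied separately to the gDNA and cDNA intensities of a sample, a function of this kind would put both sets of allele ratios on a scale centered at 0.5 before they are compared.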

Validation of imprinted genes and genomic regions 

Genes were considered to be imprinted if they had
extreme AE with an average of more than 2.9-fold
difference (1 SD calculated from genome-wide population
data) between the two alleles as well as observation of
transmission of AE that is consistent with paternal or
maternal imprinting.

For novel imprinted genes and genomic regions, at
least three consecutive SNPs needed to show extreme
AE (> 2.9-fold) for them to be included in the analysis.
For partially imprinted genes and regions, AE levels were
required to fall within a 2- to 2.9-fold average difference
among all informative heterozygotes. Windows were
calculated using a previously published method [32].
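
The thresholds above can be illustrated with a short Python sketch. It is not the published window method [32]; the function names and labels are hypothetical, and the input is assumed to be a list of per-SNP fold differences (higher allele over lower allele) ordered along the genome. Transmission patterns are not checked here.

```python
def classify_fold(mean_fold, extreme=2.9, partial=2.0):
    """Label an average allelic fold difference using the stated cut-offs."""
    if mean_fold > extreme:
        return "extreme"
    if mean_fold >= partial:
        return "partial"
    return "below threshold"


def extreme_runs(folds, threshold=2.9, min_snps=3):
    """Return (first, last) index pairs for runs of at least min_snps
    consecutive SNPs whose fold difference exceeds threshold."""
    runs = []
    start = None
    for i, fold in enumerate(folds):
        if fold > threshold:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last SNP.
    if start is not None and len(folds) - start >= min_snps:
        runs.append((start, len(folds) - 1))
    return runs
```

For example, `extreme_runs([3.1, 3.5, 4.0, 1.2, 3.0, 3.3])` reports only the first three SNPs as a candidate region, because the last two do not reach the three-SNP minimum.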

Validation of the Illumina Array was performed by 
measuring AE with normalized Sanger sequencing in 
LCL and fibroblast samples heterozygous for specific 
SNPs. Paired genomic DNA and cDNA from the sam- 
ples were amplified for a specific SNP, verified by 
agarose gel electrophoresis and sequenced with ABI 
Big Dye chemistry and capillary electrophoresis on an 
ABI 3730 sequencer (Applied Biosystems, Foster City, 
CA, USA). The relative allelic expression levels for 
each SNP were assessed with the Peak-Picker software 
[34] and allele ratios below 0.1 or above 10 were 
assigned a value of 0.1 or 10, respectively, as they 
represent monoallelic expression (indistinguishable 
from homozygous sites). Similarly, estimated allele 
ratios below 0.1 or above 10 from the Illumina 1M 
assay were also assigned these values as they do not 
significantly differ from the homozygote ratios in 
BeadChip genotyping. 
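
The capping of allele ratios described above amounts to a simple clamp. A minimal sketch, with hypothetical function names, might look like this:

```python
def clamp_allele_ratio(ratio, low=0.1, high=10.0):
    """Cap an allele ratio at the limits treated as monoallelic expression."""
    return min(max(ratio, low), high)


def clamp_all(ratios, low=0.1, high=10.0):
    """Apply the same cap to every ratio from either assay."""
    return [clamp_allele_ratio(r, low, high) for r in ratios]
```

Using the same limits for the Sanger and array estimates keeps the two assays comparable at the extremes, where neither can distinguish monoallelic expression from a homozygous site.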

Heritability 

Variants showing extreme AE were assessed for herit- 
ability of the AE using population mapping data for the 
same cell type and for transmission compatible with 
Mendelian inheritance in the pedigrees. 
Demethylation treatment

Two lymphoblast cell lines (19099 and 19141) were
treated with three concentrations (1, 5 and 10 µM) of
the demethylating drug AZA every 24 hours for 3 days.
For these treatment groups, the viability was 73%, 69%
and 68%, respectively. We chose to use a concentration
of 5 µM for treatment studies in these two cell lines.
A third LCL (12892) was treated with 10 µM AZA for 5
and 10 days. Total RNA was collected and prepared for
genome-wide AE analysis at each time point and in
untreated controls as described above.

To confirm demethylation, we also collected DNA in
untreated and treated states from 12892. We combined the
5- and 10-day treatment groups as there was insufficient
DNA for the 10-day group alone. We fragmented 10 µg of
DNA by mixing it with TE buffer and nebulization buffer
in a nebulizer cup and passing nitrogen at 45 psi
through the cup for 1 minute. The DNA was then purified
using a Qiagen MinElute PCR Purification kit (Qiagen):
buffer PBI was added and the sample was passed through
a spin column, buffer PE was then passed through the
column, and the DNA was eluted with buffer EB. An AMPure
bead purification step followed in order to isolate
fragments of the required size (over 1,000 bp). Buffer
EB and AMPure beads were added to the DNA; the beads
were then collected using a magnetic particle
concentrator and washed with ethanol, and finally the
DNA was eluted from the beads using buffer EB.

A MethylCollector kit (version B1; Active Motif,
Carlsbad, CA, USA) was used to isolate methylated CpG
islands from fragmented genomic DNA according to the
manufacturer's protocol in order to verify
demethylation of the DNA upon AZA treatment. In the
first step, 1 µg of DNA was mixed with His-MBD2b
protein, along with the binding buffer provided and
magnetic beads to capture the protein-DNA complex.
Next, the beads were collected with the magnetic
particle concentrator, washed with more binding buffer,
and collected again, and the supernatant was discarded.
Lastly, the methylated fragments were recovered by
incubating the solution with the provided elution
buffer.

Transmission analyses 

Transmission patterns from parent to offspring for AE 
loci were assessed in the above-mentioned families (two 
LCL CEPH families, one LCL Caucasian trio, one LCL 
Yoruba trio and nine fibroblast trios). Patterns
consistent with imprinting were observed when the
overexpressed allele always came from the same parent
regardless of which allele was associated with overex- 
pression in the parent. 
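
The transmission criterion can be expressed as a small check over phased observations. The sketch below is illustrative only: the function name, the input format (one pair of overexpressed allele and its parent of origin per informative heterozygote) and the labels other than the parent-of-origin case are assumptions rather than categories defined in this study.

```python
def transmission_pattern(observations):
    """Summarize AE transmission at one locus.

    observations: iterable of (overexpressed_allele, parent_of_origin)
    pairs, one per informative heterozygous individual.
    """
    observations = list(observations)
    if not observations:
        return "uninformative"
    alleles = {allele for allele, _ in observations}
    parents = {parent for _, parent in observations}
    # Same parent every time, although the overexpressed allele varies.
    if len(parents) == 1 and len(alleles) > 1:
        return "parent-of-origin (" + next(iter(parents)) + ")"
    # Same allele every time, whichever parent transmitted it.
    if len(alleles) == 1 and len(parents) > 1:
        return "allele-linked"
    # Allele and parent never vary, so the two cannot be separated.
    if len(alleles) == 1 and len(parents) == 1:
        return "ambiguous"
    return "inconsistent"
```

Only the first outcome matches the pattern described above; the "allele-linked" label corresponds loosely to AE that follows the allele rather than the parent, which is what the heritability assessment is meant to detect.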



Population mapping data 

Mapping of heritable AE traits in CEU LCLs has been pre- 
viously reported by us [32]. For the fibroblasts, a similar 
approach for population mapping was employed, using 64 
unrelated primary fibroblasts from parent-offspring trios 
(most of the children were only analyzed for genotypes in 
DNA in order to phase the parental allelic expression 
data). The parental samples were from phenotypically normal
donors of Caucasian origin. The genome-wide mapping of 
AE in primary fibroblasts will be reported separately. 

Additional material 



Additional file 1: Tables S1, S2, S3, and S4. Tables of loci not
imprinted, uninformative loci or of loci used in the validation as well as a
description of LCL and fibroblast samples.

Additional file 2: Figure S1. Figure demonstrating the correlation of AE
between normalized Sanger sequencing and the Illumina array.

Additional file 3: Tables S5 and S6. Candidate windows in LCLs and 
fibroblasts showing high allelic expression. 

Additional file 4: Figure S2. Figure demonstrating four loci showing 
imprinted expression. 